Introduction to Distributional Regression

Practical session 3

Gillian Heller

NHMRC Clinical Trials Centre, University of Sydney

gillian.heller@sydney.edu.au

Practical session 3

Mixed distributions

  • mixed discrete-continuous

  • continuous with probability mass at discrete point(s)

  • aka semi-continuous


Zero-adjusted distributions

  • On \mathbb{R}_+ with a probability spike at zero

    • daily rainfall

    • amount of insurance claim in a year

    • annual medical expenditure

f(y \mid \boldsymbol{\theta}, \pi)= \begin{cases}\pi & \text { if } y=0 \\ (1-\pi) f_c(y \mid \boldsymbol{\theta}) & \text { if } y>0\end{cases}

Zero- and/or one-inflated distributions

  • On \mathbb{R}_{(0,1)} with a probability spike(s) at zero and/or one


Example: pain

  • A patient-reported measure of pain is the Visual Analogue Scale (VAS)

  • Subjects are given a linear scale and asked to put a mark where they perceive themselves

  • Responses are measured and scaled to [0, 1]

  • While 1 is not frequently recorded, 0 is common

f\left(y \mid \boldsymbol{\theta}, \pi_0, \pi_1\right)= \begin{cases}\pi_0 & \text { if } y=0 \\ \left(1-\pi_0-\pi_1\right) f_c(y \mid \boldsymbol{\theta}) & \text { if } 0<y<1 \\ \pi_1 & \text { if } y=1\end{cases}

Mixed distributions in GAMLSS


Distribution GAMLSS name Parameters
zero-adjusted inverse Gaussian ZAIG \mu, \sigma, \nu
zero-adjusted Gamma ZAGA \mu, \sigma, \nu
zero-inflated beta BEINF0, BEZI \mu, \sigma, \nu
one-inflated beta BEINF1, BEOI \mu, \sigma, \nu
zero- and one-inflated beta BEINF \mu, \sigma, \nu, \tau


  • Zero-adjusted and zero-and/or one-inflated versions of other continuous distributions can be user-defined

Speech intelligibility testing

  • Chapter 9 of Stasinopoulos et al. (2024), Hu et al. (2015)

  • Speech intelligibility tests are conducted on hearing-impaired people, for the purpose of evaluating the performance of hearing devices under controlled listening conditions in a sound laboratory.

  • The subject listens to a short sentence (of length N words), mixed with noise, and repeats it back.

  • The number of correctly identified words: w

  • Response: proportion correct y=w/N

    • An obvious model for the proportion correct is the binomial distribution; however, the assumption of independent recognition of words in a sentence is unlikely to be satisfied.

Speech intelligibility testing

data(speech)
speech <- read.csv("Data/speech.csv")
ggplot(speech, aes(x=y)) +
  geom_histogram(binwidth = 0.04, colour="darkblue", fill="lightblue") +
  xlab("Proportion correct") + 
  ylab("Frequency") + 
  theme_bw(base_size=14)

The data


Variable Description
N number of words in sentence
w number of words correctly identified
y proportion correct (w/N)
snr signal-to-noise ratio
algorithm sound processing algorithm (A, B, C)
noise_dir noise direction (F = front, S = side)
noise_gender noise gender (M, F)
subject subject identifier

Response distribution

  • Bounded continuous

  • Zero- and one-inflated

  • Recall: lecture 2, slides 31-32

  • If f_c(y) is e.g. the Beta distribution which has 2 parameters \mu and \sigma

f\left(y \mid \mu, \sigma, \pi_0, \pi_1\right)= \begin{cases}\pi_0 & \text { if } y=0 \\ \left(1-\pi_0-\pi_1\right) f_c(y \mid \mu, \sigma) & \text { if } 0<y<1 \\ \pi_1 & \text { if } y=1\end{cases}

Response distribution

In estimating the model, we have to respect the constraint

0<\hat\pi_{0i}+\hat\pi_{1i}<1\qquad\text{for }i=1,\ldots,n

  • This is difficult

Response distribution

  • More tractable parametrization:

f(y \mid \mu, \sigma, \xi_0, \xi_1)= \begin{cases}\frac{\xi_0}{1+\xi_0+\xi_1} & y=0 \\ \frac{1}{1+\xi_0+\xi_1} \cdot f_c(y \mid \mu, \sigma) & y \in(0,1) \\ \frac{\xi_1}{1+\xi_0+\xi_1} & y=1\end{cases}

where \mu \in(0,1), \sigma \in(0,1), \xi_0>0, \xi_1>0

Response distribution

  • We choose the distribution for 0<y<1, then add zero- and one-inflation

  • Use the gamlss function chooseDist() to pick the distribution for y\in(0,1):

speech.01 <- subset(speech, (y>0&y<1))

m1 <- gamlss(y~snr, 
             sigma.formula =~ snr, 
             nu.formula =~ snr, 
             tau.formula =~ snr, 
             family=BE, data=speech.01, n.cyc=100, trace=FALSE)

a <- chooseDist(m1, type="real0to1")  
minimum GAIC(k= 2 ) family: SIMPLEX 
minimum GAIC(k= 3.84 ) family: SIMPLEX 
minimum GAIC(k= 7.67 ) family: SIMPLEX 

Response distribution

Now create the zero- and one-inflated Simplex distribution

library(gamlss.inf)
gen.Inf0to1(family="SIMPLEX", type.of.Inflation = "Zero&One")
A  0to1 inflated SIMPLEX distribution has been generated 
 and saved under the names:  
 dSIMPLEXInf0to1 pSIMPLEXInf0to1 qSIMPLEXInf0to1 rSIMPLEXInf0to1 
 plotSIMPLEXInf0to1 
  • Simplex distribution has 2 parameters \mu and \sigma

  • Zero- and one-inflation adds

    • zero-inflation parameter \xi_0 = xi0

    • one-inflation parameter \xi_1 = xi1

  • To fit the model use the function gamlssInf0to1()

gamlssInf0to1(y = y, 
               mu.formula = ~1,sigma.formula = ~1, 
               nu.formula = ~1, tau.formula = ~1, 
               xi0.formula = ~1, xi1.formula = ~1, 
               data = data, family = SIMPLEX, ...)

Variable selection: Stepwise

m0 <- gamlssInf0to1(y = y, 
               mu.formula = ~ snr,
               sigma.formula = ~snr, 
               xi0.formula = ~ snr, xi1.formula = ~ snr,                     
               data=speech, family = SIMPLEX, n.cyc=50)

m.step <- stepGAICAll.A(m0, 
                scope=list(lower=~1, 
                           upper=~ snr + noise_dir + noise_gender + algorithm +
                             algorithm:snr + algorithm:noise_dir + algorithm:noise_gender))
  • Automated variable selection (stepwise, LASSO) doesn’t work (in current version of gamlss) for user-defined distributions
  • The model likelihood decomposes into
    • the continuous “middle” component 0<y<1
    • a multinomial component: y=0\,|\, 0<y<1\,|\, y=1
  • We can choose models for these components separately

Stepwise selection

1. Select model for 0<y<1

m01.0 <- gamlss(y~snr, 
             sigma.formula =~ snr, 
             family=SIMPLEX, data=speech.01, n.cyc=100, trace=FALSE)

m01.step <- stepGAICAll.A(m01.0, 
                scope=list(lower=~1, 
                           upper=~ snr + noise_dir + noise_gender + algorithm +
                             algorithm*snr + algorithm*noise_dir + algorithm*noise_gender))
--------------------------------------------------- 
Distribution parameter:  mu 
Start:  AIC= -924.71 
 y ~ snr 

               Df     AIC
+ noise_gender  1 -931.45
+ algorithm     2 -927.83
<none>            -924.71
+ noise_dir     1 -922.71

Step:  AIC= -931.45 
 y ~ snr + noise_gender 

            Df     AIC
+ algorithm  2 -934.12
<none>         -931.45
+ noise_dir  1 -929.45

Step:  AIC= -934.12 
 y ~ snr + noise_gender + algorithm 

                         Df     AIC
+ noise_gender:algorithm  2 -935.78
<none>                      -934.12
+ noise_dir               1 -932.13
+ snr:algorithm           2 -931.59

Step:  AIC= -935.78 
 y ~ snr + noise_gender + algorithm + noise_gender:algorithm 

                Df     AIC
<none>             -935.78
+ noise_dir      1 -933.78
+ snr:algorithm  2 -932.63
--------------------------------------------------- 
Distribution parameter:  sigma 
Start:  AIC= -935.78 
 ~snr 

               Df     AIC
- snr           1 -937.35
<none>            -935.78
+ noise_dir     1 -934.51
+ noise_gender  1 -934.14
+ algorithm     2 -932.11

Step:  AIC= -937.35 
 ~1 

               Df     AIC
<none>            -937.35
+ noise_dir     1 -936.08
+ noise_gender  1 -935.89
+ snr           1 -935.78
+ algorithm     2 -933.72
--------------------------------------------------- 
Distribution parameter:  mu 
Start:  AIC= -937.35 
 y ~ snr + noise_gender + algorithm + noise_gender:algorithm 

                         Df     AIC
<none>                      -937.35
- noise_gender:algorithm  2 -935.70
- snr                     1 -867.35
--------------------------------------------------- 

Stepwise selection

1. Select model for 0<y<1

summary(m01.step)
******************************************************************
Family:  c("SIMPLEX", "Simplex") 

Call:  gamlss(formula = y ~ snr + noise_gender + algorithm +  
    noise_gender:algorithm, sigma.formula = ~1, family = SIMPLEX,  
    data = speech.01, n.cyc = 100, trace = FALSE) 

Fitting method: RS() 

------------------------------------------------------------------
Mu link function:  logit
Mu Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.145803   0.052088  -2.799  0.00517 ** 
snr                       0.058113   0.006755   8.603  < 2e-16 ***
noise_genderM            -0.016676   0.063615  -0.262  0.79324    
algorithmB               -0.021673   0.064255  -0.337  0.73593    
algorithmC               -0.152696   0.060821  -2.511  0.01213 *  
noise_genderM:algorithmB  0.149942   0.089073   1.683  0.09245 .  
noise_genderM:algorithmC  0.199014   0.085822   2.319  0.02049 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

------------------------------------------------------------------
Sigma link function:  log
Sigma Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.82451    0.01527      54   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

------------------------------------------------------------------
No. of observations in the fit:  2144 
Degrees of Freedom for the fit:  8
      Residual Deg. of Freedom:  2136 
                      at cycle:  2 
 
Global Deviance:     -953.3476 
            AIC:     -937.3476 
            SBC:     -891.9842 
******************************************************************

Stepwise selection

Selected model for 0<y<1

\begin{align*} y_i&\sim\text{SIMPLEX}(\mu_i, \sigma_i)\\ \text{logit}(\mu_i)&=\beta_0^\mu+\beta^\mu _{\text{snr}}x_{\text{snr}, i}+&\text{snr}\\ &\qquad \beta^\mu _{\text{ng=M}}x_{\text{ng=M}, i}+&\text{noise gender}\\ &\qquad \beta^\mu _{\text{alg=B}}x_{\text{alg=B}, i}+\beta^\mu _{\text{alg=C}}x_{\text{alg=C}, i}+&\text{algorithm}\\ &\qquad \beta^\mu _{\text{MB}}x_{\text{ng=M}, i}x_{\text{alg=B}, i}+\beta^\mu _{\text{MC}}x_{\text{ng=M}, i}x_{\text{alg=C}, i}&\text{noise gender}\times\text{algorithm}\\ \log(\sigma_i) &=\beta_0^\sigma \end{align*}

Stepwise selection

Selected model for 0<y<1

Variable Parameter
\mu \sigma
snr \checkmark
noise gender \checkmark
algorithm \checkmark
noise gender \times algorithm \checkmark

Stepwise selection

2. Select model for y=0\,|\, 0<y<1\,|\, y=1

  • We fit a multinomial model to the 3-level categorical response

y=0 | 0<y<1 | y=1

Multinomial response

  • gamlss has distributions MN3, MN4, MN5 for 3-, 4- and 5-level nominal responses

MN3 model

\mathbb{P}(Y=y \mid \mu, \sigma)= \begin{cases}p_0 & \text { if } y=0 \\ p_1 & \text { if } y=1 \\ p_2 & \text { if } y=2 \end{cases} where

\begin{align*} p_0&=\frac{\mu}{1+\mu+\sigma}, \qquad p_1=\frac{\sigma}{1+\mu+\sigma}, \qquad \mu>0,\quad \sigma>0 \end{align*}

  • So we model the probabilities \mathbb{P}(y=0) and \mathbb{P}(y=1)

  • y=2 is the reference level

  • We want to model \xi_0 and \xi_1, so create a multinomial response with 0<y<1 as the reference level

Multinomial response

  • Create a multinomial variable:
speech |>
  mutate(
    zero_one = factor(case_when(
      y==0 ~ 0,
      y>0 & y<1 ~ 2,
      y==1 ~ 1))) -> speech

Stepwise selection

2. Model for y=0\,|\, 0<y<1\,|\, y=1

m.mult.0 <- gamlss(zero_one ~ snr, family=MN3, data=speech, trace=FALSE)

m.mult.step <- stepGAICAll.A(m.mult.0, 
                scope=list(lower=~1, 
                           upper=~ snr + noise_dir + noise_gender + algorithm +
                             algorithm*snr + algorithm*noise_dir + algorithm*noise_gender))
--------------------------------------------------- 
Distribution parameter:  mu 
Start:  AIC= 13763.48 
 zero_one ~ snr 

               Df   AIC
+ noise_gender  1 13701
+ algorithm     2 13760
<none>            13764
+ noise_dir     1 13765

Step:  AIC= 13701 
 zero_one ~ snr + noise_gender 

            Df   AIC
+ algorithm  2 13698
<none>         13701
+ noise_dir  1 13703

Step:  AIC= 13697.8 
 zero_one ~ snr + noise_gender + algorithm 

                         Df   AIC
+ snr:algorithm           2 13681
<none>                      13698
+ noise_gender:algorithm  2 13699
+ noise_dir               1 13700

Step:  AIC= 13681.44 
 zero_one ~ snr + noise_gender + algorithm + snr:algorithm 

                         Df   AIC
<none>                      13681
+ noise_dir               1 13683
+ noise_gender:algorithm  2 13685
--------------------------------------------------- 
Distribution parameter:  sigma 
Start:  AIC= 13681.44 
 ~1 

               Df   AIC
+ snr           1 13387
+ algorithm     2 13662
<none>            13681
+ noise_gender  1 13683
+ noise_dir     1 13683

Step:  AIC= 13386.64 
 ~snr 

               Df   AIC
+ algorithm     2 13362
+ noise_gender  1 13370
<none>            13387
+ noise_dir     1 13389
- snr           1 13681

Step:  AIC= 13362.52 
 ~snr + algorithm 

                Df   AIC
+ noise_gender   1 13346
+ snr:algorithm  2 13361
<none>             13362
+ noise_dir      1 13364
- algorithm      2 13387
- snr            1 13662

Step:  AIC= 13346.09 
 ~snr + algorithm + noise_gender 

                         Df   AIC
+ noise_gender:algorithm  2 13343
+ snr:algorithm           2 13346
<none>                      13346
+ noise_dir               1 13348
- noise_gender            1 13362
- algorithm               2 13370
- snr                     1 13664

Step:  AIC= 13343.08 
 ~snr + algorithm + noise_gender + algorithm:noise_gender 

                         Df   AIC
<none>                      13343
+ snr:algorithm           2 13343
+ noise_dir               1 13345
- algorithm:noise_gender  2 13346
- snr                     1 13664
--------------------------------------------------- 
Distribution parameter:  mu 
Start:  AIC= 13343.08 
 zero_one ~ snr + noise_gender + algorithm + snr:algorithm 

                Df   AIC
<none>             13343
- snr:algorithm  2 13359
- noise_gender   1 13368
--------------------------------------------------- 

Stepwise selection

2. Model for y=0\,|\, 0<y<1\,|\, y=1

summary(m.mult.step)
******************************************************************
Family:  c("MN3", "Multinomial 3 levels") 

Call:  gamlss(formula = zero_one ~ snr + noise_gender + algorithm +  
    snr:algorithm, sigma.formula = ~snr + algorithm +  
    noise_gender + algorithm:noise_gender, family = MN3,  
    data = speech, trace = FALSE) 

Fitting method: RS() 

------------------------------------------------------------------
Mu link function:  log
Mu Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     1.12510    0.08915  12.620  < 2e-16 ***
snr            -0.28711    0.02270 -12.650  < 2e-16 ***
noise_genderM  -0.33376    0.06474  -5.156 2.60e-07 ***
algorithmB     -0.31473    0.10549  -2.984 0.002858 ** 
algorithmC     -0.36384    0.10786  -3.373 0.000747 ***
snr:algorithmB  0.11234    0.02791   4.025 5.76e-05 ***
snr:algorithmC  0.02403    0.02949   0.815 0.415192    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

------------------------------------------------------------------
Sigma link function:  log
Sigma Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.71275    0.09202  -7.745 1.10e-14 ***
snr                       0.20758    0.01208  17.181  < 2e-16 ***
algorithmB               -0.16630    0.10449  -1.592  0.11153    
algorithmC               -0.56855    0.10438  -5.447 5.31e-08 ***
noise_genderM             0.11956    0.10052   1.189  0.23433    
algorithmB:noise_genderM  0.10949    0.13685   0.800  0.42370    
algorithmC:noise_genderM  0.35711    0.13803   2.587  0.00969 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

------------------------------------------------------------------
No. of observations in the fit:  6720 
Degrees of Freedom for the fit:  14
      Residual Deg. of Freedom:  6706 
                      at cycle:  6 
 
Global Deviance:     13315.08 
            AIC:     13343.08 
            SBC:     13438.46 
******************************************************************

Stepwise selection

Overall model

m.overall <- gamlssInf0to1(y = y, 
               mu.formula = ~ snr + noise_gender*algorithm,
               sigma.formula = ~1 , 
               xi0.formula = ~ snr*algorithm + noise_gender, 
               xi1.formula = ~ snr + noise_gender*algorithm,                     
               data=speech, family = SIMPLEX, n.cyc=50)


Variable Parameter
\mu \sigma \xi_0 \xi_1
snr \checkmark \checkmark \checkmark
noise gender \checkmark \checkmark \checkmark
algorithm \checkmark \checkmark \checkmark
noise gender \times algorithm \checkmark \checkmark
snr \times algorithm \checkmark

Stepwise selection

Overall model

wp(m.overall)

Parameter interpretation: pe_pdf


try(pe_pdf(m.overall,  term="algorithm",
       scenario = list("snr"=5, "noise_gender"="F"),
       x.values = c("A", "B", "C")))
Error in pe_pdf(m.overall, term = "algorithm", scenario = list(snr = 5,  : 
  Supply a standard GAMLSS model in obj


References

Hu, W., Swanson, B.A., & Heller, G.Z. (2015). A statistical method for the analysis of speech intelligibility tests. PloS One, 10(7), e0132409. DOI: 10.1371/journal.pone.0132409
Stasinopoulos, M.D., Kneib, T., Klein, N., Mayr, A., & Heller, G.Z. (2024). Generalized additive models for location, scale and shape: A distributional regression approach, with applications (Vol. 56). Cambridge University Press.